ABSTRACT

In mammalian genomes a sixth base, 
5-hydroxymethylcytosine ( hm C), is generated by en- 
zymatic oxidation of 5-methylcytosine ( m C). This dis- 
covery has raised fundamental questions about the 
functional relevance of hm C in mammalian genomes. 
Due to their very similar chemical structure, discrim- 
ination of the rare hm C against the far more 
abundant m C is technically challenging and to date 
no methods for direct sequencing of hm C have been 
reported. Here, we report on a purified recombinant 
endonuclease, PvuRts1I, which selectively cleaves
hm C-containing sequences. We determined the
consensus cleavage site of PvuRts1I as hm CN11-12/N9-10G
and show first data on its potential to interrogate
hm C patterns in mammalian genomes.

INTRODUCTION 

In higher eukaryotes, only the C5 position of genomic 
cytosine is subject to enzymatically catalyzed 
post-replicative modification. Methylation at this 
position has long been known to play major roles in epi- 
genetic control of transcriptional activity and, as a conse- 
quence, to affect fundamental processes such as 
development (including natural reprogramming of cell 
fate), imprinting, X chromosome inactivation, genome 
stability and predisposition to neoplastic transformation 
(1,2). The recent discovery of the further modification of 
5-methylcytosine ( m C) to 5-hydroxymethylcytosine ( hm C)
by the family of Tet dioxygenases has raised major ques- 
tions on the functional relevance of this sixth base in 
mammalian genomes (3,4). While recent evidence 
supports a role for hm C as an intermediate in the erasure 
of cytosine methylation (5), other roles in controlling
genomic functions cannot be excluded. The definition of
these roles will require profiling of genomic hm C patterns, 
which presents a major technical challenge as hm C is struc- 
turally and chemically very similar to m C but in general far 
less abundant in mammalian genomes (3,4,6-9). The gold 
standard methodology for profiling of genomic m C sites, 
bisulfite conversion, cannot discriminate hm C from m C 
and all available restriction endonucleases are either 
equally sensitive to m C and hm C or not sensitive to 
either (10-12). While antibodies raised against hm C are 
commercially available, their use to probe hm C frequency
by DNA immunoprecipitation has yet to be reported and 
the accuracy of this method will depend on the relative 
affinity of these antibodies for hm C versus m C as the latter 
is present in large excess in mammalian genomes. Very 
recently enzymatic methods for selective labeling and 
identification of hm C have been reported (7,13). 

Interestingly, hm C is also present in the genomes of 
viruses that infect bacteria and unicellular algae, where 
it serves as protection against the restriction systems of 
the host. In particular, hm C accounts for up to 100% of 
the cytosine residues in the genomes of T-even coliphages. 
In these phages the hydroxymethyl group is added at the 
level of the dCMP precursor and further linked to glucose 
(in both α- and β-configurations) or gentiobiose after incorporation
of the nucleotide in the genome (14-16). We
sought to exploit enzymatic activities that evolved as part 
of the struggle between bacteria and these viruses to se- 
lectively detect hm C in mammalian genomes. Recently, we 
described an assay for quantification of global genomic 
hm C levels based on the transfer of tritiated glucose to 
hm C by T4 β-glucosyltransferase (7). Interestingly, restriction
systems have evolved in bacteria that address the
phage counter defense measures by specifically recognizing 
modified cytosine. Among these the McrBC system and 
the recently described MspJI endonuclease recognize se- 
quences containing both m C and hm C (17,18) and 
therefore per se are not useful to discriminate these
modified cytosines. At least two endonucleases,
PvuRts1I and GmrSD, were shown to restrict DNA containing
glucosylated hm C (19,20). However, GmrSD does
not cleave non-glucosylated ( hm C-containing) T4 DNA and
has the additional disadvantages of being a heterodimer
and of co-purifying with the GroEL chaperonin (19).
PvuRts1I is encoded by a single gene present on the kanamycin
resistance plasmid Rts1, originally isolated from
Proteus vulgaris, and its restriction activity in vivo was
shown to be modulated by hm C glucosylation in a
complex fashion (20). However, as PvuRts1I was not
purified, its activity has not been characterized in vitro.

Here, we show that purified recombinant PvuRts1I selectively
cleaves hm C-containing DNA and determine its
cleavage site. In addition, we present initial data on the
use of PvuRts1I as a tool to investigate hm C patterns in
mammalian genomes.



MATERIALS AND METHODS 

Cloning and purification of PvuRts1I

The sequence encoding PvuRts1I was synthesized at Mr
Gene GmbH (Regensburg) and cloned into the pET28a
vector (Novagen). BL21(DE3) Escherichia coli cells
carrying the expression vector were grown in LB
medium at 37°C until A600 = 0.6-0.7 and induced with
1 mM isopropyl β-D-thiogalactopyranoside for 16 h at
18°C. Lysates were prepared by sonication in 300 mM
NaCl, 50 mM Na2HPO4 pH 8.0, 10 mM imidazole, 10%
glycerol and 1 mM β-mercaptoethanol, cleared by centrifugation
and applied to a nickel-nitrilotriacetic acid
column (QIAGEN) pre-equilibrated with lysis buffer.
Washing and elution were performed with lysis buffer containing
20 and 250 mM imidazole, respectively. Eluted
proteins were applied to a Superdex S-200 preparative
gel filtration column (GE Healthcare) in 150 mM NaCl,
20 mM Tris pH 8.0, 10% glycerol, 1 mM DTT and peak
fractions were pooled. The stability of PvuRts1I upon
storage was improved by supplementation with 10%
glycerol.

Preparation of DNA substrates 

In vivo α/β-glucosylated and non-glucosylated T4 phage
DNA was isolated essentially as described (4). Briefly,
T4 stocks were propagated on E. coli strain CR63,
which was also used for the isolation of glucosylated T4
DNA. To isolate non-glucosylated T4 DNA, wild-type T4
phage was amplified on an ER1565 galU mutant strain.
β-glucosylated T4 DNA was generated in vitro by treatment
of non-glucosylated T4 DNA with purified T4
β-glucosyltransferase (7). Genomic DNA was isolated
from mouse cerebellum and triple knockout (TKO) embryonic
stem cells (ESCs) (21) as described (7).

Reference DNA fragments containing exclusively hm C,
m C or unmodified cytosine residues were prepared by PCR
using 5-hydroxymethyl-dCTP (Bioline GmbH),
5-methyl-dCTP (Jena Bioscience GmbH) and dCTP, respectively.
T4 phage DNA template, Phusion HF DNA
Polymerase (Finnzymes) and primer 5'-GTG AAG TAA
GTA ATA AAT GGA TTG-3', which does not contain
cytosine residues, were used for amplification of all reference
DNA fragments. To generate the reference 1139 bp
fragment with 100% hm C for restriction with PvuRts1I the
second primer was 5'-TGG AGA AGG AGA ATG AAG
AAT AAT-3', which also does not contain cytosine
residues. To generate the 800 and 500 bp control substrates
containing only m C and only unmodified cytosine
for restriction with PvuRts1I the second primer was 5'-
GCC ATA TTG ATA ATG AAA TTA AAT GTA-3' and
5'-TCA GCA ATT TTA ATA TTT CCA TCT TC-3',
respectively. PCR products were purified by gel electrophoresis
followed by silica column purification
(Nucleospin, Macherey-Nagel). The 140 bp fragment
used to determine the orientation of the PvuRts1I
cleavage overhang was amplified with primers 5'-TAT
ACT GAA GTA CTT CAT CA-3' and 5'-CTT TGC
GTG ATT TAT ATG TA-3'.

For the preparation of substrates with a single PvuRts1I
consensus containing hm C or m C in symmetrical or asymmetrical
configuration a 94 bp fragment was amplified
from the T4 genome with primers 5'-CTC GTA GAC
TGC GTA CCA ATC TAA CTC AGG ATA GTT
GAT-3' and 5'-TAT GAT AAG TAT GTA GGT TAT
T-3'. This fragment contains a single site corresponding to
the identified PvuRts1I consensus hm CN11-12/N9-10G and
was used as a template according to the strategy depicted
in Figure 3. To generate substrates with symmetric
cytosine modifications or unmodified cytosine the 
fragment was amplified with forward primer 5'-CTC 
GTA GAC TGC GTA CCA-3' and reverse primer 1 
5'-TAT GAT AAG TAT GTA GGT TAT T-3' in the 
presence of the respective modified or unmodified dCTP. 
To generate substrates with asymmetric cytosine modifi- 
cations the same forward primer was paired with reverse 
primer 2 5'-TAT GAT AAG TAT GTA GGT TAT TCA 
A-3'. 
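
Choosing a template with a single consensus site requires scanning candidate sequences for the C N11-12 / N9-10 G pattern. The following is a minimal sketch of such a scan, not the procedure used in this work; the function name `find_consensus_sites` is hypothetical. It assumes that the consensus implies 20 to 22 nucleotides between the C and the G (11+9 up to 12+10) and it inspects only the given strand and only the base sequence, not the modification state.

```python
from typing import List, Tuple

# Spacing implied by the consensus C N11-12 / N9-10 G: the number of
# nucleotides between the C and the G ranges from 11+9=20 to 12+10=22.
MIN_GAP, MAX_GAP = 20, 22


def find_consensus_sites(seq: str) -> List[Tuple[int, int]]:
    """Return 0-based (c_index, g_index) pairs where a C is followed by a G
    with 20-22 intervening nucleotides on the given strand.

    Illustrative helper only: it does not know which cytosines are modified.
    """
    seq = seq.upper()
    hits = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        for gap in range(MIN_GAP, MAX_GAP + 1):
            j = i + gap + 1  # index of the candidate G
            if j < len(seq) and seq[j] == "G":
                hits.append((i, j))
    return hits


# Toy sequence (not from the T4 genome): one C, 21 spacer bases, one G.
print(find_consensus_sites("C" + "A" * 21 + "G"))  # [(0, 22)]
```

A C with G residues at several permitted spacings is reported once per spacing, so the number of distinct sites is best taken as the number of distinct `c_index` values.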

DNA restriction with PvuRts1I and identification of
cleavage and recognition site

Unless otherwise stated, reactions contained
150 mM NaCl, 20 mM Tris pH 8.0, 5 mM MgCl2 and 1 mM
DTT. One unit of PvuRts1I was defined as the amount of
enzyme required to digest 1 µg of hm C-containing T4 DNA
in 15 min at 22°C. For assessment of enzyme specificity,
100 ng of each control fragment were digested separately
or together with 200 ng of genomic DNA in 30 µl reactions
containing standard buffer and 1 U of purified PvuRts1I
at 22°C for 15 min.

For identification of the cleavage and recognition site,
the 1139 bp fully hydroxymethylated fragment amplified
from the T4 genome or whole non-glucosylated T4
DNA was digested under standard conditions.
Fragment ends were blunted with Klenow polymerase 
(NEB) and cloned using the Zero Blunt® PCR Cloning 
Kit (Invitrogen). Randomly selected clones were 
sequenced and the data were analyzed using 
WebLogo (22). 
RESULTS

hm C-specific endonuclease activity of PvuRts1I

His-tagged PvuRts1I was expressed in E. coli and purified
to homogeneity by sequential Ni2+ affinity and size exclusion
chromatography (Figure 1A). As bacteria carrying
the Rts1 plasmid were shown to restrict the
hm C-containing T-even phages, but not m C-containing
T-odd phages or λ phage, which does not contain
modified cytosine (20), we initially used T4 genomic
DNA as a substrate to test the activity of purified
PvuRts1I. T4 genomic DNA was isolated from both
galU+ and galU- strains, the latter being UDP-glucose
deficient and thus yielding DNA that contains only
non-glucosylated hm C.
Under the same digestion conditions non-glucosylated T4
DNA was digested more efficiently than both naturally α-
and β-glucosylated and in vitro β-glucosylated counterparts
(Figure 1B). Non-glucosylated T4 DNA was
cleaved into fragments with an apparent size of about
200 bp, indicating that PvuRts1I recognizes a frequently
occurring sequence (Figure 1B and Supplementary
Figures S1 and S2). We then used non-glucosylated T4
DNA to test the activity of the enzyme under various conditions.
PvuRts1I was strictly dependent on Mg2+ ions,
which could not be substituted with Ca2+, and endonuclease
activity was maximal in the presence of 100-200 mM
NaCl (Supplementary Figure S1A and B). However,
during purification we observed that the enzyme is
unstable in solutions of ionic strength lower than
150 mM NaCl. The activity of PvuRts1I was
highest at pH 7.5-8.0 and was unaffected by the
presence of Tween 20 or Triton X-100 (Supplementary
Figure S2A and B). We also observed that after prolonged
incubation PvuRts1I precipitates even at room temperature,
consistent with the reported temperature sensitivity
of the phage restriction activity in cells carrying the Rts1
plasmid (20). Upon short incubation times maximal
activity was observed at 22°C (Supplementary Figure
S2C). Thus, the relative amounts of enzyme and DNA substrate
were standardized so that digestion was complete in
15 min at 22°C in the presence of 150 mM NaCl
(Supplementary Figures S1C and S2C).

The specificity of PvuRts1I with respect to cytosine
modification was further tested by digesting reference
fragments containing exclusively unmodified cytosine
(500 bp), m C (800 bp) or hm C (1139 bp; Figure 1C).
Under standard digestion conditions purified PvuRts1I
selectively cleaved the hm C-containing fragment, consistent
with the relative restriction efficiency of bacteriophages
with distinct cytosine modifications by bacteria
carrying the Rts1 plasmid (20).

Determination of PvuRts1I cleavage sites

To identify the cleavage pattern of PvuRts1I we generated
libraries of restriction fragments from either the whole T4
genome (Supplementary Figure S3) or an 1139 bp
fragment amplified from the same genome containing exclusively
hydroxymethylated cytosines (Figure 2).
Random sequencing of 161 and 133 fragment ends from
the whole T4 genome and 1139 bp fragment libraries
revealed that 85 and 89%, respectively, matched the consensus
sequence hm CN11-12/N9-10G. Among these 78 and
87%, respectively, showed one of three similar sequence
patterns, hm CN12/N10G, hm CN12/N9G and hm CN11/N9G,
while for the remaining fragment ends the exact number of
nucleotides between the modified cytosine and the
cleavage site could not be determined unambiguously
due to the occurrence of multiple C residues upstream
of the cleavage site. Of the sequenced fragment ends, 14
and 11% from the whole T4 genome and 1139 bp
fragment libraries, respectively, did not match the
hm CN11-12/N9-10G consensus. However, 100 and 80% of
these ends, respectively, contained at least one hm C residue
10-13 nt upstream of the cleavage site, while no guanine
was present in the T4 genomic sequence 10-11 nt downstream
of the cleavage site (Supplementary Figure S4). The
sequenced clones from the 1139 bp T4 genomic fragment
library corresponded to an 81% coverage of the fragment,
with some PvuRts1I fragments occurring multiple times,
while other fragments that were predicted on the basis of
the hm CN11-12/N9-10G consensus were not found (Figure 2
and Supplementary Figure S5). Examination of the
missing fragments did not show any common sequence
feature beyond the hm CN11-12/N9-10G consensus
(Supplementary Figure S6), suggesting that their absence
from the sequenced fragments was due to limited
sampling. Alignment of sequenced fragment ends from
the T4 genomic fragment library showed that 2 nt
around the cleavage site were missing from all clones, suggesting
a 2 nt 3'-overhang cleavage pattern
(Supplementary Figure S5). This was confirmed by direct
sequencing of the two fragments generated by digestion of
a 140 bp amplicon containing a single PvuRts1I site
(Supplementary Figure S7).

Figure 1. Selective restriction of hm C-containing DNA by PvuRts1I. (A) Purified PvuRts1I was resolved on an SDS-polyacrylamide gel and stained
with Coomassie blue. (B) T4 genomic DNA with the naturally occurring pattern of α- and β-glucosylated hm C, only β-glucosylated hm C or
non-glucosylated hm C was incubated without or with decreasing amounts of PvuRts1I as indicated. (C) Reference PCR fragments of 1139, 800
and 500 bp containing hm C, m C and unmodified cytosine at all cytosine residues, respectively, were incubated with or without PvuRts1I as indicated.

Figure 2. Cleavage site of PvuRts1I. A library of PvuRts1I restriction fragments was generated from an 1139 bp PCR fragment containing only
hydroxymethylated cytosine residues and the sequence of 133 restriction fragment ends from randomly chosen clones was determined. (A) Graphical
map of the fragment ends. A total of 119 analyzed fragment ends (triangles) matched the consensus sequence hm CN11-12/N9-10G, which was present
at 97 sites (thin vertical lines) in the 1139 bp PCR fragment (thick horizontal line). Fifty-three fragment ends related to the sequence motif
hm CN12/N10G (dark green triangles), 37 to hm CN11/N10G (bright green triangles) and 14 to hm CN11/N9G (light green triangles), while 15 fragment ends
matching the consensus sequence hm CN11-12/N9-10G could not be assigned unambiguously to any of these subsets (gray triangles). Fourteen fragment
ends did not match the prevalent consensus sequence (gray circles, see Supplementary Figure S3). (B) Occurrence of the three subsets of cleavage sites
and LOGO representation of the corresponding consensus sequence. The absolute height of each position reflects its overall conservation, while the
relative height of nucleotide letters represents their relative frequency. The slash in the three cleavage sequence subtypes indicates the exact cleavage
site.
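
The percentages quoted for the 1139 bp fragment library can be cross-checked against the fragment-end counts given in the Figure 2 legend. The short sketch below only redoes that arithmetic; the variable names are ours, the subtype labels are copied from the legend, and no new data are introduced.

```python
# Fragment-end counts from the Figure 2 legend (1139 bp fragment library).
subtype_counts = {"CN12/N10G": 53, "CN11/N10G": 37, "CN11/N9G": 14}
ambiguous = 15      # match the consensus, but subtype not assignable
non_matching = 14   # do not match the consensus
total_ends = 133    # sequenced fragment ends

assigned = sum(subtype_counts.values())
matching = assigned + ambiguous


def percent(part: int, whole: int) -> int:
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# The categories account for every sequenced end.
assert matching + non_matching == total_ends

print(matching)                             # 119 ends match the consensus
print(percent(matching, total_ends))        # 89 (% matching the consensus)
print(percent(assigned, matching))          # 87 (% of those with a defined subtype)
print(percent(non_matching, total_ends))    # 11 (% not matching)
```

The counts therefore agree with the 89, 87 and 11% figures reported in the text for this library.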

The results above reveal a symmetric nature of the
preferred cleavage sites and raise the issue of PvuRts1I
activity on sites with modified cytosine in symmetric and
asymmetric configuration. To clarify this issue, we used a
PCR strategy to generate DNA substrates with identical
sequence and containing a single PvuRts1I consensus site
with hm C or m C in symmetrical and asymmetrical configurations
or no modified cytosine (Figure 3A). In the
presence of enzyme amounts that did not cleave substrates
with unmodified and m C sites, digestion of substrates with
asymmetric hm C at the PvuRts1I site was reduced with
respect to substrates with symmetric hm C, but still appreciable.
Residual undigested substrate with symmetric hm C
at the PvuRts1I site in these reaction conditions was typically
observed with such short substrates, but not with
longer ones.



Figure 3. Differential activity of PvuRts1I on sites with symmetric and asymmetric hm C. Ninety-four bp long substrates with identical sequence were
generated that contain a single PvuRts1I consensus site (CN12/N10G) with hm C or m C in symmetrical and asymmetrical configurations or no modified
cytosine. (A) Strategy for generation of the substrates by PCR amplification in the presence of modified nucleotides. The size of the PvuRts1I
digestion products is indicated. (B) The variously modified substrates were digested with the indicated amounts of PvuRts1I and digestion products
were resolved on polyacrylamide gels. Note the reduced but tangible digestion of the substrate containing asymmetric hm C.

Digestion of mammalian genomic DNA with PvuRts1I

To investigate cleavage site preference and efficiency of
PvuRts1I digestion for mammalian genomic DNA, we
initially selected the upstream regulatory region III of
the mouse nanog gene (23). As this region was shown to
be bound by Tet1 and to acquire CpG methylation upon
knockdown of Tet1 in ESCs (5), it represents a potential
candidate as a mammalian genomic sequence containing
hm C. Real-time amplification of this region from ESC
genomic DNA did not show a significant decrease of
product after PvuRts1I digestion (data not shown). We
then devised a strategy to positively identify rare
PvuRts1I digestion products. After PvuRts1I digestion
genomic fragments were ligated to a linker with a
random 2 nt 3'-overhang. Ligation products were then
amplified using nanog specific primers paired with a
linker specific primer, but no amplification product
could be obtained (data not shown). This result may be
explained by an extremely rare occurrence of hm C at
cleavage sites of this locus (especially in symmetric configuration),
inefficiency of PvuRts1I digestion or both. In
this regard, it is important to consider that positive identification
of hm C sites in this region of the nanog locus has
actually not been reported for ESCs. In addition, during
the revision of the present work a manuscript was published
(24) that could not confirm the reduced nanog expression
and ESC differentiation previously reported
upon Tet1 knockdown (5), raising uncertainty about the
actual occurrence of hm C at the nanog promoter in ESCs.

As there are no clear and quantitative data on the levels
and density of hm C at specific genomic sites available yet,
we generated defined substrates to validate the PvuRts1I
cut-ligation amplification protocol for the identification of
hm C sites. We PCR amplified region III of the nanog
promoter in the presence of increasing concentrations of
5-hydroxymethyl-dCTP and confirmed the incorporation
of proportional levels of hm C using the recently reported
β-glucosylation assay (7) (data not shown). Fragment
samples with increasing hm C content were then digested
with PvuRts1I and the same ligation/PCR strategy for the
identification of digestion products was applied as
described above (Supplementary Figure S8A). Detection
of fragments with ends corresponding to the PvuRts1I
cleavage pattern increased with increasing hm C content.

We previously quantified global hm C levels in genomic
DNA from ESCs and adult somatic tissues using in vitro
hm C glucosylation (7). Consistent with other studies
(3,6,8,9), this analysis revealed that genomic DNA from
adult brain regions has a high hm C content. In addition,
we showed that in ESCs that are TKO for all three major
DNA methyltransferases Dnmt1, 3a and 3b (21) genomic
hm C levels were around the estimated limit of detection,
although reproducibly above background. Therefore, we
compared the PvuRts1I restriction pattern of genomic
DNA from cerebellum and TKO ESCs as representative
of samples with high and very low hm C levels, respectively.
As internal controls, we co-digested each of the two
genomic DNA samples with the same reference fragments
as used to test the specificity of PvuRts1I with respect to
cytosine modification (Figure 1C). As expected from the
relatively low abundance of hm C in mammalian genomic
DNA, there was a limited reduction of high molecular
weight fragments and appearance of a lower molecular
weight smear (Figure 4). However, DNA from cerebellum
was clearly digested to a higher extent than DNA from
TKO ESCs as evident from the line scans across the respective
gel lanes (Figure 4). The low but appreciable
degree of digestion observed for genomic DNA from
TKO ESCs does not seem to result from relaxed specificity
or contaminating nuclease activities, as only control substrates
containing hm C, but not m C or unmodified
cytosine, were digested when incubated either separately
or together with genomic DNA (Figure 1C and Figure 4).



5154 Nucleic Acids Research, 2011, Vol. 39, No. 12 



[Figure 4 image: gel lanes and aligned line scans (gray values, x1000) for Cerebellum and TKO ESCs, without (-) and with (+) PvuRts1I]


Figure 4. Restriction of mouse genomic DNA by PvuRts1I reflects hm C content. Genomic DNA from mouse cerebellum or TKO ESCs was mixed
with three reference PCR fragments of 1139, 800 and 500 bp containing hm C, m C and unmodified cytosine at all cytosine residues, respectively, and
incubated with or without PvuRts1I as indicated. Digests were resolved on a 0.8% agarose gel stained with ethidium bromide. Line scans of the gel
lanes are aligned to the image of the gel. Red and blue lines correspond to samples incubated with and without enzyme, respectively. Arrows point to
the main difference in the profiles from cerebellum and TKO ESC DNA digested with PvuRts1I (red lines).



Absence of digestion of control substrates containing m C
and unmodified cytosine was evident from the unaltered
ratio of their respective signals in the presence and absence
of enzyme. This result shows that the extent of digestion
by PvuRts1I reflects the relative hm C content in mammalian
genomic DNA.

DISCUSSION

Several modification and restriction systems have evolved
as defense and counter defense strategies in the struggle
between unicellular microorganisms and their viruses.
Here, we show that, in contrast to previously
characterized endonucleases which cleave hm C-containing
sequences, PvuRts1I has a preference for the
non-glucosylated form of this base and discriminates
against m C. This specificity makes PvuRts1I an attractive
tool to investigate genomic hm C patterns in higher eukaryotes
and complements the very recently published
methods for enzymatic labeling of this sixth base (7,13).

Importantly, we show that the extent of PvuRts1I digestion
reflects the relative abundance of hm C in genomic
DNA from cerebellum and TKO ESCs. The limited extent
of digestion even for samples with relatively high hm C
content is in line with the cleavage site preference and
dependence on cytosine modification that we determined.
We calculate that the statistical probability of the
PvuRts1I consensus site CN11-12/N9-10G in the mouse
genome is 0.126. Combined with the global hm C occurrence
in mouse tissues (up to 0.13% of all bases or
0.65% of Cs) (3,7-9) this translates into a PvuRts1I
cleavage site every 1.9 x 10^5 bases. As this is in the size
range of fragments typically obtained with standard procedures
for isolation of genomic DNA, more careful isolation
methods should be used and/or PvuRts1I specific
ends could be enriched by ligating biotinylated PvuRts1I
compatible linkers. Alternatively, digestion conditions
could be optimized or DNA could be denatured and a
second strand synthesized with m C nucleotides to cut
and reveal the likely more abundant hemimodified
PvuRts1I sites.

Notably, while cerebellum has been previously reported
among the tissues with the highest levels of genomic hm C
(3,7,9), complete absence of m C and therefore hm C would
be expected in TKO ESCs due to the lack of all three
major Dnmts (21). However, we previously detected hm C
levels slightly above background in TKO ESCs (7) and
here we show minimal but appreciable digestion by
PvuRts1I. In this context, it is interesting to note that
ESCs express the highly conserved Dnmt2 (25,26), the
only Dnmt family member with an intact catalytic
domain that has not been genetically inactivated in TKO
ESCs. Although Dnmt2 has a major role as a tRNA
methyltransferase and its function as a DNA
methyltransferase is still debated (27-32), it was recently
shown to methylate genomic sequences in Drosophila
(32,33). Future work should clarify whether the genome
of TKO ESCs harbors any residual m C and hm C.
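
The spacing estimate quoted earlier in this Discussion (one PvuRts1I cleavage site every 1.9 x 10^5 bases) can be reproduced from the two numbers given in the text under one assumption. The text does not spell out the calculation, so the sketch below is our reading of it rather than the authors' stated method: it assumes that a cleavable site requires hm C on both strands of the consensus and that the two modifications occur independently.

```python
# Values quoted in the text.
p_consensus = 0.126          # probability of the sequence consensus per position
hmc_fraction_of_c = 0.0065   # up to 0.65% of cytosines are hm C

# ASSUMPTION (ours, not stated in the text): cleavage needs hm C on both
# strands of the site, and the two cytosines are modified independently.
p_symmetric_site = p_consensus * hmc_fraction_of_c ** 2

# Expected distance between such sites, in bases.
bases_per_site = 1 / p_symmetric_site

print(f"{bases_per_site:.2e}")  # 1.88e+05
```

Under this assumption the result rounds to the 1.9 x 10^5 bases given in the text. The squared term also illustrates why the estimate is so sensitive to the local hm C density.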

Restriction of genomic DNA with PvuRts1I may be
combined with PCR amplification for analysis of specific
loci or with massively parallel sequencing or microarray hybridization
for genome-wide mapping. The calculations
reported above for the frequency of PvuRts1I cleavage
sites based on a random hm C distribution bring up the
argument that the extent of random breaks in genomic
DNA preparations would contribute very significant
noise in deep sequencing and microarray applications.
This drawback may at least be partially overcome if
specific PvuRts1I ends are enriched by ligating linkers
with a random 2 nt 3'-overhang as described here and discussed
above, a strategy that can be integrated with procedures
for generation of sequencing libraries. Also, our
simulation of genomic fragments containing known levels
of randomly distributed hm C clearly shows that relatively
high local concentrations of hm C sites are required for efficient
detection by PvuRts1I. The first genome-wide hm C
profiles from mammalian tissues have just been reported
(13). From these first data sets, it is apparent that genomic
hm C is not randomly distributed and that its accumulation
in gene bodies is proportional to transcriptional activity.
Thus, PvuRts1I may prove a valuable tool to probe hm C
accumulation at defined genomic regions. In addition, the
selectivity of PvuRts1I for hm C-containing sites may constitute
an advantage with respect to endonucleases such as
McrBC and MspJI as these enzymes do not discriminate
between m C and hm C and require in vitro enzymatic hm C
glucosylation to specifically protect hm C-containing sites
from digestion and thus distinguish them from m C sites.

In conclusion, we show that PvuRts1I is an hm C specific
endonuclease and provide a biochemical characterization
of its enzymatic properties for future applications as a
diagnostic tool in the analysis of hm C distribution at genomic
loci in development and disease.